AIbase

# Convolution-enhanced ViT

Cvt W24 384 22k
Apache-2.0
CvT-w24 is a vision transformer pre-trained on ImageNet-22k and fine-tuned at 384x384 resolution. It improves on the standard vision transformer by adding convolutions to the architecture.
Image Classification, Transformers
microsoft
66
0
Cvt 13
Apache-2.0
CvT-13 is a hybrid architecture that combines convolutional neural networks with vision transformers. It is pre-trained on ImageNet-1k and suited to image classification tasks.
Image Classification, Transformers
microsoft
21.80k
11
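
To make "convolutional enhancements" a little more concrete, the sketch below estimates how many tokens each stage of a CvT model would produce when the image is tokenized by strided, overlapping convolutions instead of fixed non-overlapping patches. The per-stage kernel, stride, and padding values are assumptions taken from the commonly published CvT configuration, not from this listing. The 224 input size for CvT-13 is likewise assumed, and `conv_out` and `token_grids` are hypothetical helper names.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a 2D convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed (kernel, stride, padding) for the convolutional token embedding
# of each of the three stages; illustrative values, not stated on this page.
ASSUMED_STAGES = [(7, 4, 2), (3, 2, 1), (3, 2, 1)]


def token_grids(image_size, stages=ASSUMED_STAGES):
    """Return the side length of the token grid after each stage."""
    grids = []
    size = image_size
    for kernel, stride, padding in stages:
        size = conv_out(size, kernel, stride, padding)
        grids.append(size)
    return grids


if __name__ == "__main__":
    # 224 is an assumed input size for CvT-13; 384 matches the
    # fine-tuning resolution listed for CvT-w24.
    for resolution in (224, 384):
        grids = token_grids(resolution)
        # Each stage yields a grid of side x side tokens.
        print(resolution, [(side, side * side) for side in grids])
```

Under these assumptions the token grid shrinks at every stage, so later stages attend over far fewer tokens than a single-resolution vision transformer would. A 384x384 input keeps a proportionally larger grid at each stage than a 224x224 input.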
© 2025 AIbase